Memory-bandwidth Based Performance Tuning and Prediction
Abstract
It is the contention of this paper that memory bandwidth has become the single most important determinant of performance on modern computer systems built from commodity processors. This contention is supported by a study of several representative scientific programs executed on the Silicon Graphics Origin2000, in which memory bandwidth is far more critical to performance than CPU speed or bandwidths between different levels of cache. This result leads us to conjecture that measuring and predicting data transfer time between memory and cache could be an effective way to predict and tune program performance. We present the design of a performance tool based on this idea. This tool could use bandwidth analysis to predict program performance and locate memory hierarchy performance problems within an application. The effectiveness of such a tool was demonstrated on the NAS SP benchmark. In the tuning step, this approach identified three loop nests with low memory hierarchy performance that could be modified through simple transformations to achieve a 19% overall performance improvement. We hand-applied the static estimation method to two subroutines and found that the difference between the estimated capacity misses and the actual number of cache misses is less than 3%. Given accurate static estimation, this approach can predict SP running time with an error of less than 10%.
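The bandwidth-based idea in the abstract can be illustrated with a minimal sketch. It is not the tool described in the paper: the function names, the utilization threshold, and the sample numbers are all hypothetical, and it assumes the simplest possible model, in which a loop's running time is bounded below by the bytes it moves between memory and cache divided by the machine's sustained memory bandwidth.

```python
def predict_time(bytes_moved, peak_bandwidth):
    """Lower-bound time estimate: memory traffic / sustained bandwidth.

    Assumption (not from the paper): time is dominated by memory transfer.
    """
    if peak_bandwidth <= 0:
        raise ValueError("peak_bandwidth must be positive")
    return bytes_moved / peak_bandwidth


def low_utilization_loops(loops, peak_bandwidth, threshold=0.5):
    """Return names of loops whose achieved bandwidth falls below
    `threshold` times the peak; these are candidates for tuning.

    `loops` maps a loop name to (bytes_moved, measured_seconds).
    The threshold value is a hypothetical choice for illustration.
    """
    flagged = []
    for name, (bytes_moved, seconds) in loops.items():
        achieved = bytes_moved / seconds          # bytes per second
        if achieved / peak_bandwidth < threshold:
            flagged.append(name)
    return flagged


# Hypothetical measurements: both loops move 800 bytes at a peak of 100 B/s.
loops = {"loop_a": (800, 10.0), "loop_b": (800, 40.0)}
print(predict_time(800, 100))              # bandwidth-bound estimate
print(low_utilization_loops(loops, 100))   # loops far from the bound
```

In this sketch, a loop that runs much longer than its bandwidth-bound estimate is using the memory system poorly, which is the kind of loop nest a tuning step would examine first.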
Related resources
Bandwidth-Based Performance Tuning and Prediction
As the speed gap widens between CPU and memory, memory hierarchy performance has become the bottleneck for most applications. This is due in part to the difficulty of fully utilizing the deep and complex memory hierarchies found on most modern machines. In the past, various tools on performance tuning and prediction have been developed to improve machine utilization. However, these tools are no...
A toolkit for optimising parallel performance
Three interacting tools to assist distributed memory programmers in developing, optimising and understanding application performance have been developed. These tools perform automatic code generation from an initial workload specification, performance prediction using memory hierarchy simulation, and performance visualisation for distributed memory message passing applications. Their combination...
Visualization and Performance Prediction of Multithreaded Solaris Programs by Tracing Kernel Threads
Efficient performance tuning of parallel programs is often hard. We present a performance prediction and visualization tool called VPPB. Based on a monitored uni-processor execution, VPPB shows the (predicted) behaviour of a multithreaded program using any number of processors and the program behaviour is visualized as a graph. The first version of VPPB was unable to handle I/O operations. This...
Adaptive Simplified Model Predictive Control with Tuning Considerations
Model predictive controller is widely used in industrial plants. Uncertainty is one of the critical issues in real systems. In this paper, the direct adaptive Simplified Model Predictive Control (SMPC) is proposed for unknown or time varying plants with uncertainties. By estimating the plant step response in each sample, the controller is designed and the controller coefficients are directly ca...
Compile Time Modeling of Off-Chip Memory Bandwidth for Parallel Loops
In this paper, we present a statistical model to predict the off-chip memory bandwidth required by a parallel loop during its execution. It is a compile-time modeling technique that derives the correlations between memory bandwidth requirement and data access patterns of multithreaded applications. This model could be used by the compiler and performance tools to predict when the sustainable me...
Publication date: 1998